AIbase

# Zero-shot video classification

Xclip Large Patch14 Kinetics 600
MIT
X-CLIP is an extended version of CLIP for general video-language understanding, trained on video-text pairs through contrastive learning.
Text-to-Video Transformers English
microsoft
124
5
Xclip Base Patch16 Zero Shot
MIT
X-CLIP is a minimal extension of CLIP for general video-language understanding. It is trained contrastively on (video, text) pairs and is suitable for zero-shot, few-shot, or fully supervised video classification, as well as video-text retrieval.
Text-to-Video Transformers English
microsoft
5,045
24
Xclip Base Patch16
MIT
X-CLIP is an extended version of CLIP for general video-language understanding, trained via contrastive learning on (video, text) pairs, suitable for tasks like video classification and video-text retrieval.
Text-to-Video Transformers English
microsoft
1,647
4
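
The cards above describe X-CLIP as trained contrastively on (video, text) pairs, which is what makes zero-shot classification possible: a video embedding is compared against text embeddings of candidate labels, and the closest label wins. The following is a minimal sketch of that scoring step only, assuming the embeddings have already been computed. The vectors, the `temperature` value, and the function names `cosine` and `zero_shot_classify` are hypothetical illustrations, not the models' actual API or outputs.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def zero_shot_classify(video_emb, label_embs, temperature=0.01):
    """Return a probability per label from similarity-based logits.

    temperature is an assumed scaling factor; real models learn their own.
    """
    labels = list(label_embs)
    logits = [cosine(video_emb, label_embs[name]) / temperature for name in labels]
    # Subtract the max logit before exponentiating for numerical stability.
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return {name: e / total for name, e in zip(labels, exps)}

# Toy 3-dimensional embeddings (made up for illustration).
video_emb = [0.9, 0.1, 0.2]
label_embs = {
    "playing guitar": [1.0, 0.0, 0.1],
    "swimming": [0.0, 1.0, 0.0],
    "cooking": [0.1, 0.2, 1.0],
}

probs = zero_shot_classify(video_emb, label_embs)
best = max(probs, key=probs.get)
print(best)
```

Because the label set is supplied as text at inference time, new classes can be added without retraining, which is the sense in which such a classifier is "zero-shot".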
© 2025 AIbase